(54) Memory system and data transfer method

(57) A DRAM system that prevents a substantial reduction in bandwidth with respect to the clock pulse frequency even when banks are accessed in no specific order, so that a seamless operation is assured not only for reading but also for writing. A prefetch mechanism is used for reading and writing data to a separate memory array at an early stage, so that the activation and precharge operation, which must be performed before reading the next set of data from the memory array, does not degrade access speed. The amount of data prefetched is twice as much as that fetched in the period represented by an array time constant, so that in a single bank structure a seamless operation can be performed both for reading and for writing, even when row accesses are performed.
Description

The present invention relates to an innovative operation and architecture for a DRAM system (a memory array system constituted by DRAMs). More specifically, the present invention relates to an innovative operation and architecture for a DRAM system that fully utilises the high-bandwidth capability and enables high speed processing.

With DRAMs, an inexpensive memory system having a large memory capacity can be constructed by using an extremely simple structure. DRAMs are, therefore, the optimal selection for a memory device to be used in a computer system. The transfer speed of DRAMs (called the bandwidth, generally represented as the product of a data width and a clock rate) is lower than that of SRAMs, another memory device. The bandwidth of DRAMs cannot keep up with the recent enhancement of the speed of MPUs, and has become one of the barriers to improving the performance of a computer system. Conventionally, various ideas have been proposed to improve the bandwidth of DRAMs.

Examples are synchronous DRAMs (SDRAM) and Rambus DRAMs (RDRAM), which adopt a system for reading/writing consecutive address data in synchronisation with a high speed clock. In a system that uses a high speed clock, theoretically, the input/output section can be operated at 100 to 250 MHz (SDRAM) or at 500 to 600 MHz (RDRAM), which is the operational speed of the clock. However, an activation and precharge operation is required for the memory array that is connected to the input/output section. As a result, the bandwidth of the entire memory system, including the input/output section and the memory array, is drastically reduced. For example, when a clock of 200 MHz is employed for SDRAM and the data width is 16 bits (two bytes), the peak bandwidth is 400 MB/s (400 MB per second). However, if the activation and precharge operation for the memory array is included, the bandwidth is reduced to about one third, 146 MB/s. This is because two array activation and precharge operations are required to read/write 4-bit consecutive data, and a period equivalent to 22 clock cycles is spent for these operations. The same applies to RDRAM. A high speed clock of 500 MHz cannot be employed effectively, and the actual operating speed is reduced to 25% to 40%. For RDRAM, if a miss occurs, an extremely long time (e.g., 140 ns) is required, and the bandwidth is drastically reduced.
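The SDRAM figures quoted above can be reproduced approximately with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the 22 clock cycles cover two 4-bit bursts (eight data cycles in total), which is one reading of the text. Under that assumption the result is about 145 MB/s, close to the 146 MB/s stated.

```python
def peak_bandwidth_mb_s(clock_mhz, width_bytes):
    """Peak bandwidth in MB/s when every clock cycle transfers one word."""
    return clock_mhz * width_bytes


def effective_bandwidth_mb_s(clock_mhz, width_bytes, data_cycles, total_cycles):
    """Bandwidth when only data_cycles out of total_cycles carry data."""
    return peak_bandwidth_mb_s(clock_mhz, width_bytes) * data_cycles / total_cycles


peak = peak_bandwidth_mb_s(200, 2)
# Assumption: two 4-bit bursts (8 data cycles) occupy the 22 clock cycles.
effective = effective_bandwidth_mb_s(200, 2, data_cycles=8, total_cycles=22)
print(f"peak = {peak} MB/s, effective = {effective:.1f} MB/s")
```

The ratio of the two values is roughly 0.36, which matches the "about one third" reduction described in the text.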

As is described above, the main factor in the reduction of the bandwidth of a system is the time required for the activation and precharge operation of the memory array. In the above systems, therefore, multiple banks (memory array blocks) are prepared and the activation and precharge operation is performed for each bank independently. The activation and precharge operation for one bank is performed while another bank is being accessed, so that the period required for the activation and precharge operation is hidden and the bandwidth is improved. A specific example of such a system is the SyncLink system (NIKKEI MICRODEVICES, August 1995, p. 152). This system independently performs data reading and writing for a memory array that is divided into multiple banks. With this system, however, while a seamless operation is ensured when different banks are sequentially accessed, a seamless operation cannot be provided when the same bank is accessed continuously. When this is taken into account, the average data rate is considerably reduced.

To increase processing speed using the conventional methods, it is always presumed that different banks will be sequentially accessed. This is because when a specific bank is accessed and continues to be accessed thereafter, the activation and precharge operation for memory cells in that bank is still required and this processing cannot be hidden. It is well known that data accesses do not always alternate between banks. Therefore, the above described method, which can be called an alternate bank access system, does not provide an effective resolution for the problem. In addition, the provision of multiple banks adversely affects the installation and product inspection costs, which is not acceptable.

To overcome the shortcoming of the method whereby multiple banks are provided, the present inventors disclosed, in "A Full bit Prefetch Architecture For Synchronous DRAMs" (IEEE JSSC, Vol. SC-30, No. 9, September 1995, pp. 998-1005) and in Japanese Unexamined Patent Publication No. Hei 07-283849, a system whereby local latches, one provided for each set of 32 bit lines among the 256 bit lines connected to a memory array, latch a total of eight data bits, and whereby the eight local latches are connected to a local buffer to perform a burst serial transfer of data. This reading mechanism is called a prefetch system, because the data are fetched in advance to the local latches. For data reading from SDRAM this system can satisfactorily compensate for a reduction in the bandwidth, and can provide a seamless operation (a condition where there are no unnecessary clock cycles between data transfers). However, with this system a seamless operation is not possible during data writing.

The present inventors disclosed, in "A Full Bit Prefetch DRAM Sensing Circuit" (IEEE JSSC, Vol. SC-31, No. 6, September 1996, pp. 762-772), a configuration whereby a full burst of read data is latched by an I/O sense amplifier in a single CAS access. With this configuration, the precharging can start two clocks before the data burst cycle begins. Since the precharging can be performed early, the subsequent RAS and CAS accesses can be performed during the burst reading of preceding data. When eight bits are employed as a burst length, seamless reading can be performed even though the same bank is accessed. With this method, however, seamless writing cannot be ensured.
It is therefore a first object of the present invention to provide a DRAM system that, with respect to clock pulse frequency, prevents substantial reduction of bandwidth. That is, the object is to provide a memory system that has a bandwidth equivalent to that of an input/output circuit.

It is a second object of the present invention to provide a memory system constituted by DRAMs, with which the first object is achieved, even when banks are accessed in no specific order.

It is a third object of the present invention to provide a memory system constituted by DRAMs whereby a seamless operation is assured not only for reading but also for writing.

It is a fourth object of the present invention to provide a function that can seamlessly perform simultaneous read and write operations.

It is a fifth object of the present invention to provide a DRAM memory system that achieves the first through the fourth objects by employing an improved prefetch system.

To achieve the above objects, according to the present invention, a prefetch mechanism is applied for writing data to a separate memory array at an early stage, so that the activation and precharge operation, which must be done before reading the next set of data from the memory array, does not degrade access speed. An amount of data is prefetched that is twice as much as that fetched in the period represented by an array time constant, so that in a single bank structure a seamless operation can be achieved both for reading and for writing, even when accesses to any row addresses are involved.

More specifically, according to the present invention, a memory system comprises: a memory array consisting of multiple memory devices, an input data path for inputting external data, an output data path for externally outputting data, an input data bit storage mechanism located between the memory array and the input data path, and an output data bit storage mechanism located between the memory array and the output data path, and is characterised in that data bits read from the memory array are held in the output data bit storage mechanism for external output across the output data path and an activation and prefetch operation is effected, which is required for a following data reading from the memory array. In addition, according to the present invention, by using such a memory system, data bits are transferred in advance from a memory array to an output data bit storage mechanism and a first burst output is performed. During this period, an operation required for a following data reading from the memory array is executed, and a next read address is assured. Then, more data bits are transferred from the memory array to the output data bit storage mechanism to perform a second burst output seamlessly, relative to the first burst output. In a write mode, external data bits are stored in input data bit storage means. The procedure at this step can be performed without being affected by the timings for the first and the second burst outputs. This is because the input data bit storage means and the output data bit storage means can be operated independently.

How the invention may be carried out will now be described by way of example only and with reference to the accompanying drawings in which:

Fig. 1 is a block diagram illustrating a memory system according to the present invention;

Fig. 2 is a more detailed diagram illustrating the memory system of Figure 1;

Fig. 3 is a timing chart for a read operation performed by the memory system according to Figures 1 and 2;

Fig. 4 is a timing chart for a write operation performed by the memory system according to Figures 1 and 2;

Fig. 5 is a timing chart for the processing of a prior art SDRAM system; and

Fig. 6 is a timing chart for the processing of the memory system according to Figures 1 and 2.

The condition for implementing a seamless operation is that the sum of the RAS-CAS delay ($t_{RCD}$) and the RAS precharge time ($t_{RP}$) is smaller than the burst length ($L_B$), i.e.,

$$t_{RCD} + t_{RP} < L_B \tag{1}$$


These times are actually represented by a number 
of clock cycles. Even when data accesses are per- 
formed between any row addresses at this time, a seam- 
less operation can be assured. That is, even when an 
activation and precharge operation for a memory array 
is included, continuous reading/writing can be per- 
formed. 

Since $t_{RCD} + t_{RP}$ on the left side of the above expression represents the minimum period of time required for accessing the memory array, this value is called an array time constant. Assuming that reading and writing are alternately performed during regular processing, the actual condition for the seamless operation can be defined as:

$$2(t_{RCD} + t_{RP}) < L_B \tag{2}$$


To satisfy the above condition, according to the present invention, in the memory system constituted by a DRAM, the memory array and the data input/output circuits are connected together by a latch, etc., so that their operations can be separated. Further, the input circuit and the output circuit are provided separately to ensure independent operations, so that reduction of the bandwidth does not occur while reading and writing are repeated alternately.

Fig. 1 is a block diagram illustrating one embodiment of the present invention. Memory arrays 1 and 2 are connected to an input circuit 20 and an output circuit 30 via a read/write latch 10. The input circuit 20 and the output circuit 30 are connected to external devices via a receiver and an input pin 21, and via an output buffer and an output pin 31, respectively. Four input latches 26, 27, 28 and 29 are provided for data input and four output latches 46, 47, 48 and 49 are provided for data output. The input latches are provided on the multiple input data paths 22, 23, 24 and 25, and the output latches on the multiple output data paths 32, 33, 34 and 35.

Fig. 2 is a detailed diagram illustrating the circuit shown in Fig. 1. The read/write latch 10 is actually separated into read latches 12 and write latches 11. Each set of four latches is connected to one of the data paths 22 to 25 or one of the data paths 32 to 35. In this embodiment, since four data paths are employed for reading and four for writing, and since four latches are connected to each data path, sixteen write latches and sixteen read latches are distributed among 256 bit lines. In other words, one latch of each type is provided for every 16 bit lines.

The number of latches to be allocated for each bit line set of the 256 bit lines is determined by the array time constant and the clock frequency. For a common 16 to 64 MB DRAM, the array time constant is 32 ns. When the data clock frequency is 250 MHz (4 ns), the array time constant corresponds to eight clock cycles, so from the above expression (2) it is apparent that only a 16-bit burst transfer need be performed. To do this, 16 latches must be provided for each bit line set of the 256 bit lines, and must prefetch to prepare for a 16-bit burst transfer. As is described above, the arrangement in Fig. 2 is only an example acquired by employing a specific array time constant and a specific clock frequency. The present invention is not limited to the arrangement in Fig. 2.
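The sizing argument above can be expressed as a small calculation. The following is a minimal sketch, not part of the disclosed circuit: the function names are hypothetical, and it follows the embodiment in treating a burst exactly equal to twice the array time constant as sufficient, although expression (2) is written as a strict inequality.

```python
import math


def array_time_constant_cycles(array_time_constant_ns, clock_period_ns):
    """Array time constant (tRCD + tRP) expressed in whole clock cycles."""
    return math.ceil(array_time_constant_ns / clock_period_ns)


def embodiment_burst_length(array_time_constant_ns, clock_period_ns):
    """Burst length chosen in the embodiment: twice the array time constant."""
    return 2 * array_time_constant_cycles(array_time_constant_ns, clock_period_ns)


def bit_lines_per_latch(bit_lines, burst_length):
    """How many bit lines share one prefetch latch."""
    return bit_lines // burst_length


burst = embodiment_burst_length(32, 4)      # 32 ns array, 4 ns clock (250 MHz)
spacing = bit_lines_per_latch(256, burst)
print(f"burst length = {burst} bits, one latch per {spacing} bit lines")
```

With a different array time constant or clock period the same functions give a different latch count, which reflects the statement that Fig. 2 is only one possible arrangement.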

In this invention, an activation and precharge operation for a memory array is performed while data are being latched. This is the same as in the above described background art. In the configuration in Fig. 2, 16 bits are latched during an array time constant period of 32 ns. Every 16 ns the latches output one bit set having a 4-bit width to the four output data paths via read buffers 36, 37, 38 and 39, which are connected to the respective output data paths 32 to 35. And every 4 ns (250 MHz) the data are output to the exterior across the output data paths by using the output latches and the output buffers. In this manner, the 16-bit burst output can be completed within a total of 64 ns, and expression (2) can be satisfied.
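The timing just described can be laid out as a simple schedule. This is an illustrative sketch only; the constant and function names are hypothetical, and the order in which the four data paths are drained onto the output is an assumption.

```python
GROUP_PERIOD_NS = 16   # a 4-bit set reaches the output data paths every 16 ns
BIT_PERIOD_NS = 4      # one bit leaves the device every 4 ns (250 MHz)
PATHS = 4              # output data paths 32 to 35
GROUPS = 4             # four 4-bit sets make up the 16-bit burst


def burst_schedule():
    """Return (start_ns, group, path) for each of the 16 output bits."""
    schedule = []
    for group in range(GROUPS):
        for path in range(PATHS):
            start_ns = group * GROUP_PERIOD_NS + path * BIT_PERIOD_NS
            schedule.append((start_ns, group, path))
    return schedule


schedule = burst_schedule()
burst_duration_ns = schedule[-1][0] + BIT_PERIOD_NS
print(f"{len(schedule)} bits output in {burst_duration_ns} ns")
```

The sixteen start times fall on consecutive 4 ns slots with no gaps, and the total burst time equals twice the 32 ns array time constant.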

For a write mode, the 4-bit latches 26, 27, 28 and 29 shown in Fig. 1 are provided for input. The 4-bit latches drive the input data paths every 16 ns to store bit sets having a 4-bit width in the write latch 11. When all 16 bits have been stored, data writing to the memory array is performed.

The processing for the circuit in Fig. 2 will now be described in detail. In a read mode, a switch RG(U) 61 or a switch RG(L) 62 goes high. When the switch RG(U) 61 goes high, the memory array 1 is selected. When the switch RG(L) 62 goes high, the memory array 2 is selected. Fig. 3 is a timing chart in a read mode when the switch RG(U) 61 goes high. Since the switch RG(U) 61 goes high, the TRUE (T)/COMPLEMENT (C) line of a sense amplifier connected to the memory array 1 is connected to the read latch 42 across the switch RG(U) 61. Since the switch RG(L) 62 does not go high, the memory array 2 is not connected to the read latch 42. In this condition, 1-bit data from the sense amplifier is latched in advance by the read latch 42. A read latch 42 is provided for every 16 bits of the sense amplifier, and latches for a total of sixteen bits are provided for a memory array (256 bit width). As is described above, one of the read buffers 36, 37, 38 and 39 for external output is provided for every 64 bits, and is connected to the read latches via switches RG1, RG2, RG3 and RG4 (not shown except for the latch 42). Hereinafter, a sense amplifier unit of 64 bit lines is called a block for convenience.

By employing the block concept, the present invention is configured as follows. The sense amplifier having a 256 bit width is constituted by four blocks each having a 64 bit width. The buffers 36, 37, 38 and 39 are respectively connected to the four blocks for outputting. These buffers can increase the driving force for the output, but they are not required to perform a latch function. Each block has four read latches 42. The four read latches 42 are connected to one data path (e.g., the data path 32) across the buffer connected to each block and the switches RG1, RG2, RG3 and RG4. The switches RG1, RG2, RG3 and RG4 go high sequentially at intervals one quarter the length of the cycle time of the switch RG(U), as is shown by the timing chart in Fig. 3. When the switch RG1 goes high, for example, the bit stored in the read latch 42 (D1) is output externally from the buffer 36. Then, the switch RG2 goes high, and the bit stored in the connected read latch 42 (D2) is output externally from the buffer 36. This process is repeated for the switches RG3 and RG4, and data D3 and D4 are output externally via the buffer 36. This operation is performed for each block. Since there are four blocks in the memory array (256 bit width) and four corresponding buffers and data paths are arranged, data are output externally in units of four bits, as is shown in Fig. 3. More specifically, in the respective blocks, the data corresponding to the condition where the switch RG1 goes high are output in parallel through the respective buffers 36, 37, 38 and 39 by the output latches 46, 47, 48 and 49 that are connected to the buffers. As is shown in Fig. 1, since the data from the output latches 46, 47, 48 and 49 are finally output externally through one output pin 31, the four sets of parallel data are output externally in a burst mode with successively delayed timings by using a clock four times as fast.
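The read path described above amounts to a parallel-to-serial conversion in two stages. The sketch below models it in software under stated assumptions: the names are hypothetical, each read latch is taken to hold one bit, and the order in which the four output latches are drained onto the pin (block 0 first) is assumed, since the text does not specify it.

```python
BLOCKS = 4              # four 64-bit-line blocks, one buffer and data path each
LATCHES_PER_BLOCK = 4   # four read latches per block, gated by RG1..RG4


def read_burst(read_latches):
    """Serialise 16 prefetched bits onto a single output pin.

    read_latches[block][k] is the bit gated by switch RG(k+1) in that block.
    """
    pin_stream = []
    for rg in range(LATCHES_PER_BLOCK):               # RG1..RG4 go high in turn
        # One bit per block appears in parallel on the four output data paths.
        parallel = [read_latches[block][rg] for block in range(BLOCKS)]
        # The output latches drain these four bits at four times the RG rate.
        pin_stream.extend(parallel)
    return pin_stream


latches = [[f"B{b}D{d + 1}" for d in range(LATCHES_PER_BLOCK)] for b in range(BLOCKS)]
stream = read_burst(latches)
print(stream[:4])
```

In this model the first four bits on the pin are the D1 bits of the four blocks, followed by the D2, D3 and D4 bits, giving a 16-bit burst.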

The write mode will now be described. In the write mode, data are input in advance to the input latches 26, 27, 28 and 29 at a timing four times as fast. As switches WG1, WG2, WG3 and WG4 go high sequentially, as is shown in Fig. 4, four bits of data are stored in the write latch 41 and in corresponding write latches (not shown) in the other blocks. More specifically, there are four blocks in the memory array and they are connected to the data paths 22, 23, 24 and 25 respectively. The four switches WG1, WG2, WG3 and WG4 are connected to each data path. Therefore, when the switch WG1 goes high relative to the data paths 22, 23, 24 and 25, as is shown in Fig. 4, 4 bits of data (D1) are stored, one in each of four write latches. This process is repeated as the switches WG2, WG3 and WG4 go high in order, and a total of 16 bits of data are stored, one in each of the 16 write latches. When the switch WG(U) goes high after this processing sequence has been completed, the 16 bits of data are stored in the memory array in a burst mode. As is shown in Fig. 4, since all of the data bits are stored in the write latches at the time the switch WG4 goes high, the switch WG(U) can go high at the same time as the switch WG4.
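The write path is the mirror image of the read path: a serial-to-parallel conversion followed by a single transfer into the array. The following sketch is a software model under assumptions, not the disclosed circuit: the function names are hypothetical, and the assignment of consecutive input bits to blocks 0 to 3 is assumed.

```python
BLOCKS = 4              # one input data path (22 to 25) per block
LATCHES_PER_BLOCK = 4   # four write latches per block, gated by WG1..WG4


def fill_write_latches(pin_stream):
    """Distribute a 16-bit input burst over the 4 x 4 write latches."""
    if len(pin_stream) != BLOCKS * LATCHES_PER_BLOCK:
        raise ValueError("expected a 16-bit burst")
    write_latches = [[None] * LATCHES_PER_BLOCK for _ in range(BLOCKS)]
    for wg in range(LATCHES_PER_BLOCK):                 # WG1..WG4 go high in turn
        # Four bits captured by the input latches 26 to 29 at the fast clock.
        group = pin_stream[wg * BLOCKS:(wg + 1) * BLOCKS]
        for block, bit in enumerate(group):
            write_latches[block][wg] = bit
    return write_latches


def commit_to_array(write_latches):
    """Model WG(U) going high: all 16 latched bits move to the array at once."""
    return [bit for block in write_latches for bit in block]


filled = fill_write_latches(list(range(16)))
stored = commit_to_array(filled)
print(filled[0], len(stored))
```

Because the latches are filled independently of the array, the array itself is only occupied for the single commit step, which is what allows the write to be fitted in alongside a read burst.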

Fig. 6 is a timing chart for explaining the operation of a DRAM system according to the present invention, compared with a conventional SDRAM system (Fig. 5). The burst length for the conventional SDRAM system is four and the burst length for the DRAM system of the present invention is sixteen, and the clock frequency for both of them is 125 MHz.

Referring to Fig. 5, in the SDRAM, row address R1 
and column address C1 are determined by the leading 
edges (activation) of RAS and CAS. Based on these ad- 
dresses, four data bits are output continuously in a burst 
mode. But when a 4-bit burst has been completed, as 
the activation and precharge operation for a memory ar- 
ray takes much time, the designation of the next row ad- 
dress R2 and column address C2 is delayed. Therefore, 
a succeeding 4-bit burst cannot be performed following 
a preceding 4-bit burst. That is, when the same bank is 
accessed, a seamless operation cannot be performed. 

Referring to Fig. 6, as for the conventional SDRAM, the row address and the column address are designated by the leading edges of the RAS and CAS. Data reading begins in a 16-bit burst mode based on the first address (R1, C1). As is described above, the 16-bit burst is performed by a read latch group that operates separately from the memory array. During the 16-bit burst transfer, operations required for the next burst transfer, such as an activation and precharge operation for a memory array, can be performed. These memory operations are completed by the time the address (R3, C3) for a following data reading is designated, because the period required for the 16-bit burst satisfies the above expression (1).

When the access is only for reading, a seamless operation can be assured with a shorter burst length than that shown in Fig. 6. In practice, however, the writing operation is also performed as needed in addition to the reading. In the present invention, expression (2), where the burst length is longer than that in expression (1), is employed. Then, an address for writing can be specified during the read data burst transfer, and writing can be performed at the same time as the burst transfer for reading. Referring to Fig. 6, the reading burst transfer address is specified by the row address R1 and the column address C1. Before the burst transfer is completed, burst transfer addresses R3 and C3 for a following data reading are designated, and the burst transfer addresses R2 and C2 for writing are selected. Thus, even when a writing operation intervenes during the burst transfer for reading, the reading operation will not be halted. This is because the burst length that satisfies the expression (2) is employed, and because the mechanism for independently performing the reading and writing is employed. The read latch 12 and the write latch 11 are designed to be operated independently in order to separately perform the reading and writing operations.
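The contrast between Fig. 5 and Fig. 6 can be illustrated with a deliberately simplified model. Everything in the sketch below is an assumption made for illustration: the function names are hypothetical, the 32 ns array time constant from the 250 MHz example is reused at 125 MHz (four 8 ns cycles), and the conventional same-bank case is modelled as exposing one full array time constant between bursts.

```python
def idle_cycles_conventional(array_cycles):
    """Same-bank access without decoupling latches: the array operation
    sits fully between two bursts (simplifying assumption)."""
    return array_cycles


def idle_cycles_prefetch(burst_length, array_cycles, accesses_per_burst=2):
    """With decoupled latches, array operations overlap the burst; a gap
    appears only if they take longer than the burst itself. Two accesses
    per burst correspond to one read plus one interleaved write."""
    return max(0, accesses_per_burst * array_cycles - burst_length)


def bus_utilisation(burst_length, idle_cycles):
    """Fraction of clock cycles that carry data."""
    return burst_length / (burst_length + idle_cycles)


ARRAY_CYCLES = 4  # assumed: 32 ns array time constant at an 8 ns clock
conventional = bus_utilisation(4, idle_cycles_conventional(ARRAY_CYCLES))
prefetch = bus_utilisation(16, idle_cycles_prefetch(16, ARRAY_CYCLES))
print(f"conventional: {conventional:.2f}, prefetch: {prefetch:.2f}")
```

Under these assumptions the 16-bit burst leaves room for both a read access and a write access to the array within one burst period, so no idle cycles appear on the data bus.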

According to the present invention, provided is a DRAM system that can prevent a substantial reduction in bandwidth with respect to a clock pulse frequency even when banks are accessed in no specific order. As a result, provided is a memory system constituted by DRAMs whereby a seamless operation is assured not only for reading but also for writing.

Claims 

1. A memory system, comprising:

a memory array (1) consisting of a plurality of memory devices,

an input data path (22-25) for inputting external data,

an output data path (32-35) for externally outputting data,

an input data bit storage mechanism (26-29) located between said memory array and said input data path, and

an output data bit storage mechanism (46-49) located between said memory array (1) and said output data path (32-35), characterised in that:

data bits read from said memory array (1) are held in said output data bit storage mechanism (46-49) for output externally across said output data path (32-35) and a process required for an immediately following data reading in said memory array is performed; and

said input data bit storage means and said output data bit storage means can be operated respectively.

2. The memory system according to claim 1, wherein external data is held in said data bit storage mechanism via said input data path, and is then written to said memory array.

3. A data transfer method for exchanging data with an external device, for a memory system that comprises a memory array (1) consisting of a plurality of memory devices, an input data path (22-25) for inputting external data, an output data path (32-35) for externally outputting data, an input data bit storage mechanism (26-29) located between said memory array (1) and said input data path (22-25), and an output data bit storage mechanism (46-49) located between said memory array (1) and said output data path, characterised by

a step of transferring data bits in advance from said memory array (1) to said output data bit storage mechanism (46-49) and of performing a first burst output;

a step of, during said first burst output, performing an operation required for a following data reading in said memory array (1), and of acquiring a following read address; and

a step of transferring more data bits from said memory array (1) to said output data bit storage mechanism (46-49) to seamlessly perform a second burst output relative to said first burst output.

4. The data transfer method according to claim 3, whereby said burst transfer bit length is set so that a relationship between an array time constant ($t_1$), which is a period needed for an operation required for a following data reading in said memory array, and a period ($t_2$), which is needed for performance of a burst transfer in accordance with a predetermined burst transfer bit length, is $2t_1 < t_2$.

5. A data transfer method for exchanging data with an external device, for a memory system that comprises a memory array (1) consisting of a plurality of memory devices, an input data path (22-25) for inputting external data, an output data path (32-35) for externally outputting data, an input data bit storage mechanism (26-29) located between said memory array (1) and said input data path (22-25), and an output data bit storage mechanism (46-49) located between said memory array (1) and said output data path (32-35), characterised by:

a step of transferring data bits in advance from said memory array (1) to the output data bit storage mechanism (46-49) and of performing a first burst output;

a step of, during said first burst output, performing an operation required for a following data reading in said memory array (1), and of acquiring a following read address;

a step of transferring more data bits from said memory array (1) to said output data bit storage mechanism (46-49) to seamlessly perform a second burst output relative to said first burst output;

a step of storing external data bits in said input data bit storage means (26-29); and

a step of storing in said memory array (1) data bits received from said input data bit storage means (26-29).

6. The data transfer method according to claim 5, whereby a process at said step for storing in said memory array (1) data bits received from said input data bit storage means (26-29) is performed without being affected by timings for said first and said second burst outputs.